
Unify interpreter and witness generation logic. #56

Merged
merged 16 commits into main from interpreter-unified on Feb 29, 2024

Conversation

LindaGuiga
Contributor

This PR aims to unify the logic of the interpreter and that of the CPU, for two main reasons:

  • the interpreter is now used by jumpdest_analysis during generation,
  • the continuation will also need the interpreter to generate segments.

The discrepancy between the interpreter and CPU logic has led to bugs in both of those use cases, and unifying them makes debugging and usage much easier.

Another change to the Interpreter is the removal of its lifetime parameter:

  • prover_inputs are already available in the generation_state, so we use these instead of the Interpreter's specific field
  • since all tests -- except one small test aimed solely at exercising the interpreter run -- use the kernel code, this PR always assumes that the Interpreter's code is the kernel code, which means we can also remove the Interpreter's code field (which also required a lifetime); see the sketch just below.
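
As a rough sketch of the lifetime removal (the type and field names below are illustrative stand-ins, not the actual definitions in this repository):

use ethereum_types::U256;

// Stand-in for the real generation state; prover inputs now live here,
// so the interpreter no longer needs to borrow them.
struct GenerationStateSketch {
    prover_inputs: Vec<U256>,
}

// Before (roughly): borrowing the code and prover inputs forced a lifetime.
struct InterpreterBefore<'a> {
    code: &'a [u8],
    prover_inputs: &'a [U256],
    generation_state: GenerationStateSketch,
}

// After: the code is always the kernel and prover inputs come from the
// generation state, so the borrowed fields and the lifetime disappear.
struct InterpreterAfter {
    generation_state: GenerationStateSketch,
}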

Since one of the motivations for this PR is to prepare for continuation, it also includes the following changes, which make segment generation much easier:

  • the values in a MemoryState are stored as Option<U256>,
  • the initial MPT values are not stored in the TrieData segment if we are in the interpreter, but instead stored in the new interpreter field preinitialized_segments.

These two changes make it easier to determine accurately which parts of memory have been accessed during a single segment execution.
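
As a minimal sketch of what the Option<U256> representation buys (hypothetical types, not the actual MemoryState implementation; U256 is the ethereum_types type used elsewhere in the codebase):

use ethereum_types::U256;

// One memory segment whose cells are `None` until they are written.
struct SegmentSketch {
    values: Vec<Option<U256>>,
}

impl SegmentSketch {
    fn set(&mut self, offset: usize, val: U256) {
        if offset >= self.values.len() {
            self.values.resize(offset + 1, None);
        }
        self.values[offset] = Some(val);
    }

    // Unset cells read as zero, matching the default memory value.
    fn get(&self, offset: usize) -> U256 {
        self.values
            .get(offset)
            .copied()
            .flatten()
            .unwrap_or(U256::zero())
    }

    // Offsets that were actually written during this segment's execution.
    fn touched_offsets(&self) -> Vec<usize> {
        self.values
            .iter()
            .enumerate()
            .filter_map(|(i, v)| v.is_some().then_some(i))
            .collect()
    }
}

With a plain Vec<U256> defaulting to zero, touched_offsets would have no way to tell a written zero apart from an untouched cell.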

Comment on lines +492 to +499
let get_vals = |opt_vals: &[Option<U256>]| {
    opt_vals
        .iter()
        .map(|&elt| match elt {
            Some(val) => val,
            None => U256::zero(),
        })
        .collect::<Vec<U256>>()
Contributor

There's a lot of this stuff going on in the PR, a consequence of using Option<U256>. Are we sure we can't avoid wrapping values in Option? Can we just use 0 or U256::MAX or some other trickery?

Contributor Author

The problem is that we need to know whether a certain memory spot was set or not. But memory values can take any U256 value, so unfortunately, I think we do need the Option<U256>...
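
To spell out why a sentinel value can't work here (hypothetical helpers, not PR code):

use ethereum_types::U256;

// With a sentinel such as zero, a cell explicitly written with 0 is
// indistinguishable from a cell that was never written.
fn was_written_sentinel(cell: U256) -> bool {
    cell != U256::zero() // wrongly reports `false` for a cell set to 0
}

// With Option<U256>, `None` means "never written" and `Some(0)` means
// "written with the value 0", so accessed-cell tracking stays exact.
fn was_written_option(cell: Option<U256>) -> bool {
    cell.is_some()
}

The same argument rules out U256::MAX or any other in-band marker, since that is also a legal memory value.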

Contributor

@dvdplm dvdplm left a comment

(partial review)

@4l0n50
Contributor

4l0n50 commented Feb 23, 2024

PR always assumes that the Interpreter's code is the kernel code, which means we can also remove the code field of the Interpreter (which also required a lifetime).

But ad-hoc code might be useful for testing macros or any code not in the kernel.

@dvdplm
Contributor

dvdplm commented Feb 23, 2024

But ad-hoc code might be useful for testing macros or any code not in the kernel.

Maybe we can revisit this when we have a concrete need for it? Or is this a blocker in your opinion?

@4l0n50
Contributor

4l0n50 commented Feb 23, 2024

But ad-hoc code might be useful for testing macros or any code not in the kernel.

Maybe we can revisit this when we have a concrete need for it? Or is this a blocker in your opinion?

I don't feel strongly about this. The only problem is that once we have such a test (I have something like that on another branch), we would need to go back to the previous, more general version of the interpreter where the code can be anything. But maybe there's an easy trick that I can't see right now (here's the test)

Contributor

@hratoanina hratoanina left a comment

First pass.
I'll take another look later.

}
}

/// Returns the value of a `State`'s clock.
Contributor

Maybe not in this PR, but we should unify the clock someday. I think we should add a clock register for the prover too, but it's not a strong opinion either.

@Nashtare Nashtare added this to the zk-continuations - Q1 2024 milestone Feb 27, 2024
LindaGuiga and others added 3 commits February 27, 2024 16:08
* Add traits State and Transition

* Move perform_state_op to Transition

* Fix bug

* Get rid of is_generation_state + add reviews

* Remove TDO

* Fix handle_error

* Move preinitialized_segments from the interpreter to MemoryState

* Clippy

* Apply get_halt_context comment

---------

Co-authored-by: Linda Guiga <lindaguiga3@gmail.com>
Collaborator

@Nashtare Nashtare left a comment

Thanks Linda! I may need to get back to it one more time, but leaving some comments in the meantime for discussion

@@ -87,7 +87,7 @@ fn test_jumpdest_analysis() -> Result<()> {
.get_mut(&CONTEXT)
.unwrap()
.pop();
interpreter.push(U256::one());
interpreter.push(41.into());
Collaborator

Why was this changed?

Contributor Author

With the unification, the proofs are checked in the reverse order. I'm still not sure why, but it must come from a previous discrepancy between the interpreter and witness generation that we hadn't identified...

Collaborator

cc @4l0n50 in case you have an explanation

Contributor

Hmmm. Maybe it's because the jumpdests were sorted at some point, but that's not done anymore? (as it was not necessary)

Collaborator

@Nashtare Nashtare left a comment

Thank you Linda! Looks good to me!

@@ -432,7 +483,7 @@ impl<F: Field> State<F> for GenerationState<F> {
fn incr_interpreter_clock(&mut self) {}
Collaborator

I think I'd rather have the trait define the blanket no-op impl, so that we don't have to keep a reference to something that's interpreter specific here.
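
A minimal sketch of that suggestion (the trait and type names are illustrative, not the real items in this repository):

trait StateSketch {
    // Interpreter-only bookkeeping; the default body is a no-op, so the
    // prover-side state never has to mention it.
    fn incr_interpreter_clock(&mut self) {}
}

struct GenerationStateSketch;
impl StateSketch for GenerationStateSketch {} // inherits the no-op

struct InterpreterSketch {
    clock: usize,
}
impl StateSketch for InterpreterSketch {
    fn incr_interpreter_clock(&mut self) {
        self.clock += 1;
    }
}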

Contributor Author

@LindaGuiga LindaGuiga Feb 29, 2024

Actually, in the end, I think it's better to increment the interpreter clock in push_cpu, since this is when the generation "clock" is increased. This would get rid of that method altogether and would ensure better unification (we noticed that the clock of the interpreter was not increased when calling generate_exception).

Collaborator

Ah yeah, that's even better.

@LindaGuiga LindaGuiga merged commit 9252d0e into 0xPolygonZero:main Feb 29, 2024
5 checks passed
@Nashtare Nashtare deleted the interpreter-unified branch March 6, 2024 07:53